[bio] Update biomed property embeddings. #4721

Merged
clincoln8 merged 9 commits into datacommonsorg:master from bio-embeddings on Nov 11, 2024

@clincoln8 clincoln8 commented Nov 7, 2024

Adds:

  • mechanismOfAction
  • hgncID
  • inChIKey

Updates:

  • unifiedMedicalLanguageSystemConceptUniqueIdentifier -> umlsConceptUniqueID
  • ncbiTaxonID -> ncbiTaxId
  • referenceAlleleNCBI -> referenceAllele
  • genomicCoordinates -> hasGenomicCoordinates

Deletes:

  • diseaseName
  • observedAllele
  • hg19GenomicPosition
  • hg19GenomicLocation
  • hg38GenomicPosition
  • hg38GenomicLocation
  • hasRNATranscript
  • ncbiDNASequenceName
  • imageUrl
  • availableStrength

Example Screenshots (three images, not captured in this text)

Example: [mechanismOfAction](https://screenshot.googleplex.com/BUyvhKHv7AZuMwF)
@clincoln8 clincoln8 requested a review from chejennifer November 7, 2024 22:59
Contributor Author

@chejennifer I'm not sure why we're seeing this diff. I'm removing the hasRNATranscript property which is related to the query, but I don't know why it's affecting entity recognition.

Contributor

Ah, the reason is that this query used to read its entity from the context (the previous query), but we only read from context if at least one entity or property is detected in the current query. Now that no property is detected, we don't use the context, so no entity is returned. We should actually remove this query from the tests, since it isn't really testing anything anymore.
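
For readers following along, the rule described above can be sketched roughly as follows. This is a minimal illustration only, not the project's actual code: the `Detection` type, the `resolve_entities` function, and the example entity ID are all hypothetical names made up for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Detection:
    """Entities and properties detected in one query (hypothetical type)."""
    entities: List[str] = field(default_factory=list)
    properties: List[str] = field(default_factory=list)


def resolve_entities(current: Detection,
                     previous: Optional[Detection]) -> List[str]:
    """Pick the entities for the current query (hypothetical helper)."""
    # Entities detected in the current query are used directly.
    if current.entities:
        return current.entities
    # Context is consulted only when the current query detected
    # something, here a property; otherwise the context is ignored.
    if current.properties and previous is not None:
        return previous.entities
    return []


# Hypothetical previous query that named an entity.
prev = Detection(entities=["bio/exampleGene"])

# While hasRNATranscript was a known property, the follow-up query
# detected it and inherited the entity from the context.
before = resolve_entities(Detection(properties=["hasRNATranscript"]), prev)

# With the property removed, nothing is detected in the follow-up
# query, so the context is skipped and no entity is returned.
after = resolve_entities(Detection(), prev)
```

Under these assumptions, `before` holds the entity carried over from the previous query and `after` is empty, which matches the diff seen in the test output.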

Contributor

@chejennifer chejennifer left a comment

nice!

Three outdated, resolved review threads on tools/nl/embeddings/input/bio/sheets_svs.csv
Contributor

@chejennifer chejennifer left a comment

thanks for adding!

@clincoln8 clincoln8 merged commit 3b0f5ab into datacommonsorg:master Nov 11, 2024
9 checks passed
@clincoln8 clincoln8 deleted the bio-embeddings branch November 11, 2024 17:45